Fix #36: Add initial Fathom rules with 98.7% average training accuracy #45

biancadanforth · 2018-08-12T04:23:35Z

@Osmose , ready for review! I'm going to put my conclusions in #36 shortly, but basically, I think we should move ahead with Fathom with a CSS selector fallback.

Builds off #38 with a more extensive ruleset based off a training corpus (available on the Commerce Google drive in the Engineering subfolder) of 50 product pages from our top five sites.

Training:

98.7% average accuracy across all 3 features for the training set (100% title, 100% image, 96% price).

Coefficients and values:

(Pulled from ./src/fathom_coefficients.json)

{
  "largerImageCoeff": 2,
  "largerFontSizeCoeff": 7,
  "hasDollarSignCoeff": 8,
  "hasPriceInIDCoeff": 17,
  "hasPriceInClassNameCoeff": 2,
  "isAboveTheFoldPriceCoeff": 33,
  "isAboveTheFoldImageCoeff": 13,
  "isNearbyImageXAxisPriceCoeff": 5,
  "isNearbyImageYAxisTitleCoeff": 5,
  "hasPriceishPatternCoeff": 15
}

Image - best accuracy: 100%

Price - best accuracy: 94%

Title - best accuracy: 100%

erikrose · 2018-08-14T15:57:08Z

src/fathom_extraction.js

+  const extractedElements = {};
+  const rules = productRuleset.get('product').rulesetMaker;
+  for (const feature of PRODUCT_FEATURES) {
+    let fnodesList = rules([


There is a bit of compilation that happens on ruleset construction, so, for speed, you might want to eventually construct this once, in global scope, and keep it around.

Ah thanks; makes sense. What's happening behind the scenes there?

Osmose

Sorry for the delay in reviewing this. Nice work on getting this to an acceptable accuracy in such short time!

Osmose · 2018-08-14T00:00:56Z

src/fathom_extraction.js

+      isAboveTheFoldImageCoeff,
+      isNearbyImageXAxisCoeff,
+      hasPriceishPatternCoeff,
+    ]).against(doc).get(`${feature}`);


This re-creates and re-runs the ruleset for each feature, right? If this is all in a single ruleset we can just run it once, can't we?

The string interpolation for feature has no effect here, just pass it directly.

Osmose · 2018-08-14T00:02:13Z

src/fathom_extraction.js


-const SCORE_THRESHOLD = fathomCoeffs.hasPriceClass;
+const PRODUCT_FEATURES = ['title', 'price', 'image'];
+const SCORE_THRESHOLD = 4;


This is defined in two spots, can we define it once and import it?

Osmose · 2018-08-14T00:06:05Z

src/fathom_extraction.js

+      extractedProduct[feature] = (feature === 'image'
+        ? extractedElements[feature].src
+        : extractedElements[feature].innerText
+      );


Ternary statements aren't syntactically clear, so it's worth avoiding them unless they are clearer to read than the alternative. Same for the return statement below.

Osmose · 2018-08-14T00:08:27Z

src/fathom_ruleset.js

+ * file, You can obtain one at http://mozilla.org/MPL/2.0/. */
+
+import {dom, out, rule, ruleset, score, type} from 'fathom-web';
+import {ancestors} from 'fathom-web/utils'; // for training: utilsForFrontend


I don't understand the comment here.

For training, the trainer add-on imports Fathom as a submodule, and its utilities file is called "utilsForFrontend" rather than "utils".

This is one of the two action items @erikrose was going to work on. I will make it more clear in the meantime.

Osmose · 2018-08-14T00:13:58Z

src/fathom_ruleset.js

+            || style.display === 'none'
+            || style.opacity === '0'
+            || style.width === '0'
+            || style.height === '0') {


Storing large conditions like this in a variable makes it easier to read the if statement:

const isElementHidden = ( style.visibility === 'hidden' || style.display === 'none' || style.opacity === '0' || style.width === '0' || style.height === '0' ); if (elementIsHidden) { return false; }

Osmose · 2018-08-15T19:54:24Z

src/fathom_ruleset.js

+        const imageElement = fnode._ruleset.get('image')[0].element; // eslint-disable-line no-underscore-dangle
+        const imageDOMRect = imageElement.getBoundingClientRect();
+        const deltaX = eleDOMRect.left - imageDOMRect.right;
+        // priceish element is always* to the right of the image


You mean in the context of our top 5? I don't think we should make that assumption. Can we not just take the absolute value of the delta?

Taking the absolute value of the delta alone won't work, as if the element overlaps with the image element in the x-axis, it would score higher than an element that doesn't. However, I can calculate a deltaX based on whether the element is to the left or right of the image (without overlapping) to cover cases where the element is to the left of the image.

Osmose · 2018-08-15T19:56:04Z

src/fathom_ruleset.js

+       */
+      function hasPriceishPattern(fnode) {
+        const text = fnode.element.innerText;
+        /**


The two asterisks are used for doc comments at the start of a function/class/variable. This is just a regular comment and should use //.

I automatically associate // with single line comments and /* */ with multi-line comments.

A /* */ comment would also be fine. A block comment that starts with two stars (/**) instead of one (/*) is specifically a doc comment and should follow JSDoc conventions, and should only occur either immediately above a variable/class/function/module that it is documenting, or be one of the few exceptions (such as type definitions).

Osmose · 2018-08-15T19:56:52Z

src/fathom_ruleset.js

+      /**
+       * Checks if fnode is nearby the top scoring image element in the y-axis
+       */
+      function isNearbyImageYAxis(fnode) {


Why is the Y axis used for eligibility but the X axis is used for scoring?

Y-axis
For price, there is little correlation in vertical proximity to the image, as some pages' price is toward the top of the image, and in others, it's toward the middle. On a lot of pages, like the one below, there are lots of priceish candidates that could score higher than the correct price depending on whether I choose the center or top of the image as the reference point.

The title is consistently near the top of the image, so it might benefit from being scored by vertical proximity.

X-axis
For the X-axis, there is a reliable correlation between x-proximity and the correct price (see example page above). I don't use this function to score title as I did not see a correlation there (also titles can be above the image, which breaks this rule).

FWIW: I just tried making the X function a binary check like the Y function, and I got much worse accuracy (80% compared to 94% on price).
I also tried making the Y function a fuzzier measure, scoring it like the X. Unfortunately for price, on some pages it's closer to the top of the image, on others it's closer to the center or center bottom. Accuracy on price went down from 94% to 74%. Interestingly for title, the title elements are consistently closer to the top of the image, and making that change brought accuracy up from 96% to 100%.

Conclusion
In conclusion, I am going to change the binary Y-axis rule to a score-based rule for title, and I am going to leave the binary Y-axis rule as-is for price. I am also going to leave the score-based X-axis rule as-is for price, though I renamed it for clarity, since that rule is only used for price and not other features.

It might be worth leaving a short comment explaining that Y axis scoring has worse results.

Osmose · 2018-08-15T19:58:09Z

src/fathom_ruleset.js

+        const DOMRect = element.getBoundingClientRect();
+        const imageElement = fnode._ruleset.get('image')[0].element; // eslint-disable-line no-underscore-dangle
+        const imageDOMRect = imageElement.getBoundingClientRect();
+        if (DOMRect.top >= (imageDOMRect.top - TOP_BUFFER)


In general it's more readable when splitting stuff into multiple lines to not put part of what you're splitting on the first line. Maybe change this to:

if ( DOMRect.top >= (imageDOMRect.top - TOP_BUFFER) && DOMRect.bottom <= imageDOMRect.bottom ) {

Or even better, store the comparison in a variable that describes what it's computing.

Osmose · 2018-08-15T19:59:36Z

src/fathom_ruleset.js

+         * Image rules
+         *
+         * If training, leave uncommented, as 'price' and 'title' rules depend
+         * on the `out` of these 'image' rules.


I understand the dependency and the limitation that FathomFox can only train based on a single type, but if it's okay to leave this uncommented during a training run, why do we need to comment out the other ones during a training run?

I just checked this, and you're right that we don't need to comment out rules in a ruleset that we're not using to train a particular feature. Will remove those comments.

biancadanforth

My replies to feedback from Osmose and erikrose.

Thank you for the review! I will file one commit for the majority of changes, and a second commit for larger structural changes recommended in this comment and this comment.

biancadanforth · 2018-08-16T17:48:15Z

src/fathom_extraction.js

+  const extractedElements = {};
+  const rules = productRuleset.get('product').rulesetMaker;
+  for (const feature of PRODUCT_FEATURES) {
+    let fnodesList = rules([


Ah thanks; makes sense. What's happening behind the scenes there?

biancadanforth · 2018-08-16T17:54:08Z

src/fathom_ruleset.js

+ * file, You can obtain one at http://mozilla.org/MPL/2.0/. */
+
+import {dom, out, rule, ruleset, score, type} from 'fathom-web';
+import {ancestors} from 'fathom-web/utils'; // for training: utilsForFrontend


For training, the trainer add-on imports Fathom as a submodule, and its utilities file is called "utilsForFrontend" rather than "utils".

This is one of the two action items @erikrose was going to work on. I will make it more clear in the meantime.

biancadanforth · 2018-08-16T18:29:08Z

src/fathom_ruleset.js

+ * Rulesets to train.
+ *
+ * Drop this file into the fathom-trainees/src folder (replacing the default file)
+ * to train Fathom against this ruleset.


After talking with Osmose in slack to clarify this, I will submit a separate commit to this PR to address this issue with more detail there.

biancadanforth · 2018-08-16T18:40:28Z

src/fathom_ruleset.js

+      function largerFontSize(fnode) {
+        const sizeWithUnits = window.getComputedStyle(fnode.element).fontSize;
+        const size = sizeWithUnits.replace('px', '');
+        if (size) {


Interesting! I didn't know that; seems to check out though.

biancadanforth · 2018-08-16T18:42:57Z

src/fathom_ruleset.js

+      function hasPriceInID(fnode) {
+        const element = fnode.element;
+        const parentElement = element.parentElement;
+        const ID = element.id;


You have said previously that abbreviations/acronyms should be capitalized, so that's why I have it capitalized. Perhaps it becomes confusing though when there are no other terms in the variable name? e.g. ID versus parentID?

biancadanforth · 2018-08-16T22:47:09Z

src/fathom_ruleset.js

+      /**
+       * Checks if fnode has the same innerText as any of its children
+       */
+      function removeRedundantAncestors(fnode) {


I can rename the method to hasDifferentInnerTextThanChildren, since that's really what's being checked.

Effectively what this does is remove elements who have the same innerText as their child, since it is used in a when clause in a rule further down in this file.

I observed that a lot of <div> and <span> elements are style wrappers on other elements. Rather than keeping all of those "redundant" innerText elements in the game and scoring them (they'd end up with the same or similar score as the innermost element), I'd like to eliminate them right at the beginning. This has two benefits:

Performance: the more elements you eliminate from scoring, the less computation is needed to find the highest scoring element.

Training: When I tag an element as the "correct" price element, I can only choose one for each feature. I was getting wrong accuracy numbers, since the value of the innerText was correct, but the specific element Fathom was choosing would vary from the innermost element to one of its ancestors.

biancadanforth · 2018-08-16T23:11:23Z

src/fathom_ruleset.js

+        const imageElement = fnode._ruleset.get('image')[0].element; // eslint-disable-line no-underscore-dangle
+        const imageDOMRect = imageElement.getBoundingClientRect();
+        const deltaX = eleDOMRect.left - imageDOMRect.right;
+        // priceish element is always* to the right of the image


Taking the absolute value of the delta alone won't work, as if the element overlaps with the image element in the x-axis, it would score higher than an element that doesn't. However, I can calculate a deltaX based on whether the element is to the left or right of the image (without overlapping) to cover cases where the element is to the left of the image.

biancadanforth · 2018-08-16T23:44:52Z

src/fathom_ruleset.js

+       */
+      function hasPriceishPattern(fnode) {
+        const text = fnode.element.innerText;
+        /**


I automatically associate // with single line comments and /* */ with multi-line comments.

biancadanforth · 2018-08-17T00:04:11Z

src/fathom_ruleset.js

+      /**
+       * Checks if fnode is nearby the top scoring image element in the y-axis
+       */
+      function isNearbyImageYAxis(fnode) {


Y-axis
For price, there is little correlation in vertical proximity to the image, as some pages' price is toward the top of the image, and in others, it's toward the middle. On a lot of pages, like the one below, there are lots of priceish candidates that could score higher than the correct price depending on whether I choose the center or top of the image as the reference point.

The title is consistently near the top of the image, so it might benefit from being scored by vertical proximity.

X-axis
For the X-axis, there is a reliable correlation between x-proximity and the correct price (see example page above). I don't use this function to score title as I did not see a correlation there (also titles can be above the image, which breaks this rule).

FWIW: I just tried making the X function a binary check like the Y function, and I got much worse accuracy (80% compared to 94% on price).
I also tried making the Y function a fuzzier measure, scoring it like the X. Unfortunately for price, on some pages it's closer to the top of the image, on others it's closer to the center or center bottom. Accuracy on price went down from 94% to 74%. Interestingly for title, the title elements are consistently closer to the top of the image, and making that change brought accuracy up from 96% to 100%.

Conclusion
In conclusion, I am going to change the binary Y-axis rule to a score-based rule for title, and I am going to leave the binary Y-axis rule as-is for price. I am also going to leave the score-based X-axis rule as-is for price, though I renamed it for clarity, since that rule is only used for price and not other features.

biancadanforth · 2018-08-17T00:14:46Z

src/fathom_ruleset.js

+         * Image rules
+         *
+         * If training, leave uncommented, as 'price' and 'title' rules depend
+         * on the `out` of these 'image' rules.


I just checked this, and you're right that we don't need to comment out rules in a ruleset that we're not using to train a particular feature. Will remove those comments.

biancadanforth · 2018-08-20T17:58:32Z

@Osmose , ready for another look.

I created #61 as a follow-up issue as we discussed in Slack.

Osmose

Nice work, this is very close!

I dunno why some of my comments are marked as outdated. I was trying to comment on a subset of the diff. Let me know if they're actually outdated and I'll see about updating them.

Osmose · 2018-08-20T23:24:33Z

src/fathom_extraction.js


 const PRODUCT_FEATURES = ['title', 'price', 'image'];
-const SCORE_THRESHOLD = 4;
+// Array of numbers corresponding to the coefficients
+const coefficients = Object.values(defaultCoefficients);


The order of values returned by Object.values is implementation-specific and shouldn't be relied upon. We can't guarantee that the coefficients are returned in the same order every time.

We could:

Store an array in the JSON file.

Have a utility function that takes in the coefficient object and returns the coefficients in the correct order, perhaps as a static method on RulesetMaker.

Change RulesetFactory to take the coefficient object in as a parameter instead of the list. I'm not sure how we'd modify the trainees file though.

Okay I went for option two.

Osmose · 2018-08-20T23:27:38Z

src/fathom_extraction.js

-      isNearbyImageXAxisCoeff,
-      hasPriceishPatternCoeff,
-    ]).against(doc).get(`${feature}`);
+    let fnodesList = rules.against(doc).get(feature);


This is still running the ruleset against the document three separate times, but we only need to run it once and extract all three features, don't we?

Osmose · 2018-08-20T23:29:08Z

src/fathom_extraction.js

+      if (feature === 'image') {
+        extractedProduct[feature] = extractedElements[feature].src;
+      }
+      extractedProduct[feature] = extractedElements[feature].innerText;


This overwrites the value if the feature was image since the if statement is missing an else.

Osmose · 2018-08-20T23:30:22Z

src/ruleset_factory.js

+   */
+  constructor(coefficients) {
+    [
+      this.largerImageCoeff,


I was unaware you could use this in destructured arrays, huh!

Yeah it took me a while to figure out how I could use destructuring here.

Osmose · 2018-08-20T23:43:17Z

src/ruleset_factory.js

+    const upperHeightLimit = viewportHeight * 2;
+    // Use a falling trapezoid function to score the element
+    // Taken from: https://github.com/mozilla/fathom-trainees
+    if (top >= upperHeightLimit) {


The whole "falling trapezoid" function thing is still very confusing to me. I feel like comments describing what we're doing in context would make things easier:

// If the node is below the fold by more than a viewport's length, return a low score. if (top >= upperHeightLimit) { return ZEROISH * featureCoeff. } // If the node is above the fold, return a high score. if (top <= viewportHeight) { return ONEISH * featureCoeff; } // Otherwise, scale the score linearly between the fold and a viewport's length below it. const slope = (ONEISH - ZEROISH) / (viewportHeight - upperHeightLimit); return (slope * (top - upperHeightLimit) + ZEROISH) * featureCoeff;

Sorry if I'm being picky, I'm just trying to ensure I won't go blank next time I look at this. :P

Osmose · 2018-08-20T23:54:03Z

src/trainees.js

+  [
+    /**
+     * A ruleset that finds the main product image on a product page.
+     */


This doc comment seems unnecessary given that the ruleset has a key and it's clear from the rest of the file that these are rulesets. The ruleset name comment can probably go as well.

Osmose · 2018-08-20T23:54:31Z

src/trainees.js

+      coeffs,
+      /**
+      * @param {Array.number} coefficients
+      */


Probably also unnecessary to comment this function, the signature of these functions is explained above already.

Osmose · 2018-08-20T23:55:31Z

src/trainees.js

+        // The coefficients are updated over time during training, so create a new factory for
+        // each iteration
+        const rulesetFactory = new RulesetFactory(coefficients);
+        return rulesetFactory.makeRuleset(); // The ruleset


This comment doesn't really add anything either.

Osmose · 2018-08-20T23:56:44Z

src/trainees.js

+      /**
+      * @param {Array.number} coefficients
+      */
+      rulesetMaker(coefficients) {


This function doesn't change between the features, so we can define it once above and use it three times here.

Osmose · 2018-08-20T23:58:15Z

src/trainees.js

@@ -0,0 +1,101 @@
+/* This Source Code Form is subject to the terms of the Mozilla Public


This file is also extraction-related, and should be placed inside the extraction folder along with the other files.

@Osmose

These rules successfully pull out product title, price and image from the following product pages (one each from the 5 top sites): * [Amazon](https://www.amazon.com/KitchenAid-KL26M1XER-Professional-6-Qt-Bowl-Lift/dp/B01LYV1U30?smid=ATVPDKIKX0DER&pf_rd_p=0c7b792f-241a-4510-94f4-dd184a76f201&pf_rd_r=AZD7BGV3JZGTB23F30X3) * [Ebay](https://www.ebay.com/p/Best-Choice-Products-650W-6-speed-5-5QT-Kitchen-Food-Stand-Mixer-with-Stainless-Steels-Bowl-Black/3018375728?iid=253733404998) * [Walmart](https://www.walmart.com/ip/KitchenAid-Classic-Series-4-5-Quart-Tilt-Head-Stand-Mixer-Onyx-Black-K45SSOB/29474640) * [Best Buy](https://www.bestbuy.com/site/jbl-everest-elite-750nc-wireless-over-ear-noise-cancelling-headphones-gunmetal/5840136.p?skuId=5840136) * [Home Depot](https://www.homedepot.com/p/Husky-SAE-Combination-Wrench-Set-10-Piece-HCW10PCSAE/202934501) TODO: * Create a training set with FathomFox and run these rules against them to measure their accuracy for 50 product pages (10 from each top site). * Modify trimTitle method, so it doesn't cut off the color from the title for the product on Ebay. * Generalize formatPrice method. @Osmose, would you have any suggestions?

Product title rules previously pulled the unique 'title' element from the 'head' element on the page (part of the pages metadata). While this ostensibly requires less processing (we don't have to search the DOM or score any other elements), the title string often requires site-specific cleaning such as to remove the vendor name, and the final, cleaned up string cannot not be verified as accurate by Fathom, which only tells us if our rules picked the right element. The alternative approach, implemented here, is to pull the title from the corresponding element in the content of the page. Since Fathom can verify that the right element was selected, and the string from this element would not require any cleaning, this approach is a much better proxy for extracting the correct product title.

This enables the same script to be used for training and running in the commerce webextension. How to train a ruleset with Fathom: 1. Follow Fathom's [Trainer instructions](https://github.com/erikrose/fathom-fox#the-trainer). 2. Open the [Fathom Trainees](https://github.com/mozilla/fathom-trainees) add-on in a new profile. 3. Install FathomFox in that window from AMO. 4. Drag and drop the training corpus into that window. - Note: The training corpus are HTML files frozen using [FathomFox's DevTools panel](https://github.com/erikrose/fathom-fox#the-developer-tools-panel); our training corpus is on the shared "commerce" Google drive. - Note: As of the date of this commit, the Corpus Collector is not a recommended option for building a training corpus due to a `freeze-dry` dependency bug that inserts a bunch of extra garbage when re-freezing a frozen page. 5. Copy ./src/fathom_ruleset.js into fathom-trainees/src/trainees.js and save over it. 6. Choose a feature to train, 'price', 'title' or 'image', and edit `trainees.set()` so that one of those features is the first argument. 7. Comment out the rules pertaining to all but that feature. - Currently, you can only train one ruleset at a time with Fathom, and only one `out` (e.g. 'title', 'image' or 'product') at a time for a given ruleset. - If you have multiple `out`s you'd like to train simultaneously, repeat this process for the remaining features so Fathom is running in a separate browser window for each feature and its corresponding rules. 8. Click the FathomFox browserAction and select "Train" 9. Select the feature from the dropdown list and click the "Train against the tabs in this window" button. 10. The array of coefficients displayed on the training page will update over time as Fathom optimizes them; this could take a while.

@erikrose

These rules and coefficients yield the following accuracy based on a training corpus of 50 product pages from our top 5 sites (Amazon, Ebay, Walmart, Best Buy and Home Depot): * 100% for product 'image' * 96% for product 'title' * 94% for product 'price' Product 'price' and 'title' features have proximity rules based on the highest scoring product 'image' element. For now, this is done by accessing the image fnode using an internal '_ruleset' object; @erikrose is working on better support for this use case in the very near future, so this implementation can be improved at that time.

A follow-up commit will address [this comment](https://github.com/mozilla/webext-commerce/pull/45/files#r210363008) and [this comment](https://github.com/mozilla/webext-commerce/pull/45/files#r210361004).

The first script, 'ruleset_factory.js', exports a class to create a ruleset based on a set of coefficients; instances of this class are used in production (via 'fathom_extraction.js') and for Fathom training (via 'trainees.js'). 2. The second script, 'trainees.js', is used exclusively for training using the FathomFox web extension and does not ship with the commerce web extension. Additional changes and notes: * I chose not to make use of the 'autobind' decorator in 'ruleset_factory.js', since it is also used in the training add-on, where devDeps like 'babel-core' and 'babel-plugin-transform-decorators-legacy' do not exist. * I also turned off an eslint rule that requires class methods to use 'this', since some methods in RulesetFactory don't require it, and it would be tedious and confusing to call some methods on the class instance and others on the class itself. * The new training script ('trainees.js') has three elements in the map it exports, one for each product feature ('image', 'title', 'price'). This allows us to select which feature to train from a dropdown menu on FathomFox's trainer page. * Currently, for training, four files must be copied over into the 'fathom-trainees' add-on src directory: * config.js * fathom_default_coefficients.json * ruleset_factory.js * trainees.js (overwritting the existing file) * In a separate commit, I will put all the Fathom extraction files into an 'extraction' (or similar) subfolder.

biancadanforth · 2018-08-24T22:06:24Z

@Osmose Ready for you again. I replied to erikrose in the Fathom needs/aspirations doc re: the utils versus utilsForFrontend strangeness.

Osmose

r+wc, just one minor thing. Great work!

Osmose · 2018-08-25T01:56:13Z

src/extraction/trainees.js

+ * file, You can obtain one at http://mozilla.org/MPL/2.0/. */
+
+/* eslint-disable import/no-unresolved */
+// This file is moved up a level to the ./src folder for training


Why can't we just remove extraction from the imports here? Then the file doesn't need to be moved.

The fathom-trainees add-on assumes trainees.js is dropped into the ./src folder. With a config.js dependency in the WebExtension, we would need to change some paths around (either the extraction subfolder or the path to config.js for training).

However, this comment made me realize my reason for using config.js was no longer necessary, so having removed it, I can put all the scripts needed for training into the top level ./src folder without this additional step! Thanks.

Also updated the Code Organization section of the README to include the new 'extraction' subfolder.

This const was not actually needed by more than one file, which simplifies how 'trainees.js' and its imported scripts are used for training.

biancadanforth mentioned this pull request Aug 12, 2018

Investigate Fathom-based page extraction #36

Closed

biancadanforth requested a review from Osmose August 13, 2018 17:47

erikrose reviewed Aug 14, 2018

View reviewed changes

biancadanforth mentioned this pull request Aug 14, 2018

Global extension storage, tracking items, and price updates #39

Merged

Osmose suggested changes Aug 15, 2018

View reviewed changes

biancadanforth commented Aug 17, 2018

View reviewed changes

biancadanforth changed the title ~~Fix #36: Add initial Fathom rules with 96.7% average training accuracy~~ Fix #36: Add initial Fathom rules with 98.7% average training accuracy Aug 17, 2018

biancadanforth force-pushed the 36-fathom-rules branch from f3c543a to bf9a919 Compare August 20, 2018 17:42

biancadanforth mentioned this pull request Aug 20, 2018

Webpack trainees.js with a separate config file. #61

Open

biancadanforth requested a review from Osmose August 20, 2018 20:23

Osmose suggested changes Aug 21, 2018

View reviewed changes

biancadanforth added 7 commits August 24, 2018 09:27

#36: Incorporate feedback from Osmose and erikrose.

8e0de88

A follow-up commit will address [this comment](https://github.com/mozilla/webext-commerce/pull/45/files#r210363008) and [this comment](https://github.com/mozilla/webext-commerce/pull/45/files#r210361004).

#36: Reorganize files into an 'extraction' subfolder.

4b1d448

biancadanforth force-pushed the 36-fathom-rules branch from bf9a919 to fdef992 Compare August 24, 2018 22:00

biancadanforth force-pushed the 36-fathom-rules branch from fdef992 to 3d1c904 Compare August 24, 2018 22:16

Osmose approved these changes Aug 25, 2018

View reviewed changes

#36: Incorporate more feedback from Osmose.

486142b

Also updated the Code Organization section of the README to include the new 'extraction' subfolder.

biancadanforth force-pushed the 36-fathom-rules branch 2 times, most recently from 5982fae to 60d4493 Compare August 27, 2018 18:43

Fix #36: Remove 'SCORE_THRESHOLD' const from config.js.

e3e4f15

This const was not actually needed by more than one file, which simplifies how 'trainees.js' and its imported scripts are used for training.

biancadanforth force-pushed the 36-fathom-rules branch from 60d4493 to e3e4f15 Compare August 27, 2018 18:45

biancadanforth merged commit 5c50172 into master Aug 27, 2018

biancadanforth deleted the 36-fathom-rules branch August 27, 2018 19:27

		@@ -0,0 +1,101 @@
		/* This Source Code Form is subject to the terms of the Mozilla Public

Fix #36: Add initial Fathom rules with 98.7% average training accuracy #45

Fix #36: Add initial Fathom rules with 98.7% average training accuracy #45

Conversation

biancadanforth commented Aug 12, 2018 • edited Loading

Training:

Coefficients and values:

Image - best accuracy: 100%

Price - best accuracy: 94%

Title - best accuracy: 100%

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Osmose left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

biancadanforth left a comment • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

biancadanforth commented Aug 20, 2018

Osmose left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

biancadanforth commented Aug 24, 2018

Osmose left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

biancadanforth Aug 27, 2018 • edited Loading

Choose a reason for hiding this comment

biancadanforth commented Aug 12, 2018 •

edited

Loading

biancadanforth left a comment •

edited

Loading

biancadanforth Aug 27, 2018 •

edited

Loading